Hands-on Assignment: Audio Processing
Deadline: Friday, 17 October 2025, 23:59 WIB
Submission Link : https://tally.so/r/wLeEXJ
Personal details:
Muhammad Yusuf
122140193
# === Import all required libraries ===
import os
from pathlib import Path
import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display
import soundfile as sf
from scipy import signal
from scipy.signal import butter, filtfilt
from pydub import AudioSegment
from pydub.silence import split_on_silence
import pyloudnorm as pyln
import IPython.display as ipd
Question 1: Recording and Analysis of Multi-Level Voice
# ==========================================================
# ANSWER TO QUESTION 1 - Audio Loading, Visualization, and Resampling
# ==========================================================
# ========================================
# 1. Load Audio
# ========================================
base_dir = Path.cwd()
audio_path = base_dir / "data" / "soal1.wav"
print("=" * 50)
print("LOAD AUDIO")
print("=" * 50)
print(f"Audio Path: {audio_path}")
# Load the audio at its native sample rate (no automatic resampling)
y, sr = librosa.load(audio_path, sr=None)
print(f"Sample Rate: {sr} Hz")
print(f"Audio Duration: {librosa.get_duration(y=y, sr=sr):.2f} seconds")
print(f"Number of Samples: {len(y)}")
# librosa.load downmixes to mono by default, so y is 1-D here
print(f"Channels: {'Mono' if y.ndim == 1 else 'Stereo'}")
print("=" * 50)
# ========================================
# 2. Plot Waveform dan Spectrogram (Original)
# ========================================
print("\nDisplaying waveform and spectrogram (original)...")
plt.figure(figsize=(16, 8))
# Waveform
plt.subplot(2, 1, 1)
librosa.display.waveshow(y, sr=sr)
plt.title('Waveform (Original)')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
# Spectrogram
plt.subplot(2, 1, 2)
D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
librosa.display.specshow(D, sr=sr, x_axis='time', y_axis='hz')
plt.colorbar(format='%+2.0f dB')
plt.title('Spectrogram (Original)')
plt.tight_layout()
plt.show()
# ========================================
# 3. Resampling Process
# ========================================
print("\n" + "=" * 50)
print("RESAMPLING PROCESS")
print("=" * 50)
target_sr = 16000
y_resampled = librosa.resample(y, orig_sr=sr, target_sr=target_sr)
print(f"Original Sample Rate : {sr} Hz")
print(f"Target Sample Rate : {target_sr} Hz")
print(f"Original Duration : {librosa.get_duration(y=y, sr=sr):.2f} s")
print(f"Resampled Duration : {librosa.get_duration(y=y_resampled, sr=target_sr):.2f} s")
# The resampled signal is kept in memory only
print("\n(The resampled audio is not saved to a file; it is only played back.)")
# ========================================
# 4. Playback Audio
# ========================================
print("\n" + "=" * 50)
print("AUDIO PLAYBACK")
print("=" * 50)
print("▶ Original Audio:")
display(ipd.Audio(y, rate=sr))
print("\n▶ Resampled Audio:")
display(ipd.Audio(y_resampled, rate=target_sr))
# ========================================
# 5. Comparison Plot (Before vs After Resampling)
# ========================================
print("\nDisplaying waveform and spectrogram comparison...")
plt.figure(figsize=(16, 8))
# Waveform Original
plt.subplot(2, 2, 1)
librosa.display.waveshow(y, sr=sr)
plt.title(f'Waveform (Original) - SR: {sr} Hz')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
# Spectrogram Original
plt.subplot(2, 2, 2)
D_orig = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
librosa.display.specshow(D_orig, sr=sr, x_axis='time', y_axis='hz')
plt.colorbar(format='%+2.0f dB')
plt.title('Spectrogram (Original)')
# Waveform Resampled
plt.subplot(2, 2, 3)
librosa.display.waveshow(y_resampled, sr=target_sr)
plt.title(f'Waveform (Resampled) - SR: {target_sr} Hz')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
# Spectrogram Resampled
plt.subplot(2, 2, 4)
D_resamp = librosa.amplitude_to_db(np.abs(librosa.stft(y_resampled)), ref=np.max)
librosa.display.specshow(D_resamp, sr=target_sr, x_axis='time', y_axis='hz')
plt.colorbar(format='%+2.0f dB')
plt.title('Spectrogram (Resampled)')
plt.tight_layout()
plt.show()
==================================================
LOAD AUDIO
==================================================
Audio Path: c:\Users\muham\OneDrive\Desktop\sistem-teknologi-multimedia-122140193\worksheet3\data\soal1.wav
Sample Rate: 44100 Hz
Audio Duration: 26.35 seconds
Number of Samples: 1161888
Channels: Mono
==================================================

Displaying waveform and spectrogram (original)...

==================================================
RESAMPLING PROCESS
==================================================
Original Sample Rate : 44100 Hz
Target Sample Rate : 16000 Hz
Original Duration : 26.35 s
Resampled Duration : 26.35 s

(The resampled audio is not saved to a file; it is only played back.)

==================================================
AUDIO PLAYBACK
==================================================
▶ Original Audio:

▶ Resampled Audio:

Displaying waveform and spectrogram comparison...
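Resampling from 44100 Hz to 16000 Hz lowers the Nyquist limit from 22050 Hz to 8000 Hz, so content above 8 kHz cannot survive. The following is a minimal sketch of that effect on a synthetic signal, not on `soal1.wav`. The two-tone test signal and the helper `band_energy_fraction` are illustrative assumptions, and `scipy.signal.resample_poly` is used as a stand-in for `librosa.resample`, which may use a different resampling algorithm internally.

```python
import numpy as np
from scipy.signal import resample_poly


def band_energy_fraction(x, sr, f_lo, f_hi):
    """Fraction of the total spectral energy that lies in [f_lo, f_hi) Hz."""
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    mask = (freqs >= f_lo) & (freqs < f_hi)
    return spectrum[mask].sum() / spectrum.sum()


orig_sr, target_sr = 44100, 16000
t = np.arange(orig_sr) / orig_sr  # one second of samples

# Hypothetical test signal: a 1 kHz tone plus a 12 kHz tone of equal amplitude
x = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 12000 * t)

# 16000 / 44100 reduces to 160 / 441; resample_poly low-pass filters before decimating
x_16k = resample_poly(x, up=160, down=441)

high_before = band_energy_fraction(x, orig_sr, 8000, orig_sr / 2 + 1)
low_after = band_energy_fraction(x_16k, target_sr, 900, 1100)

print(f"New Nyquist limit: {target_sr / 2:.0f} Hz")
print(f"Energy above 8 kHz before resampling: {high_before:.2f}")
print(f"Energy near 1 kHz after resampling: {low_after:.2f}")
```

In this sketch the 12 kHz tone sits above the new Nyquist limit, so the anti-aliasing filter removes it and almost all remaining energy is concentrated around 1 kHz. The same reasoning suggests why the resampled recording can sound less crisp.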
Analysis Section: Question 1
Based on the waveform and spectrogram, there is no striking difference between the five voice segments (soft, normal, loud, shrill, shouting), which should differ in amplitude and frequency density. The likely cause is a recording problem, such as a microphone placed too close or an auto-gain feature that levels the volume, so that changes in intensity are not captured proportionally. As a result, the whispered part sounds louder than it should and the shouted part appears compressed. After resampling from 44100 Hz to 16000 Hz, the signal shape and frequency distribution remain almost the same. Only the maximum representable frequency drops (the Nyquist limit falls from 22050 Hz to 8000 Hz), which makes the audio sound softer and less crisp than the original.
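The claim that the five segments barely differ in level can be checked numerically rather than by eye. The sketch below computes RMS and peak levels in dBFS per segment. It assumes the five segments have equal length, which may not hold for the actual recording, and it runs on a synthetic stand-in signal. The function name `segment_levels_db` and the demo amplitudes are hypothetical.

```python
import numpy as np


def segment_levels_db(y, n_segments=5, eps=1e-12):
    """Return (rms_db, peak_db) arrays for n_segments roughly equal chunks of y.

    Assumption: the segments are equally long. For a real recording,
    replace array_split with manually chosen segment boundaries.
    """
    chunks = np.array_split(np.asarray(y, dtype=float), n_segments)
    rms_db = np.array([20 * np.log10(np.sqrt(np.mean(c ** 2)) + eps) for c in chunks])
    peak_db = np.array([20 * np.log10(np.max(np.abs(c)) + eps) for c in chunks])
    return rms_db, peak_db


# Hypothetical stand-in for the recording: five 1-second tones,
# each twice as loud as the previous one
sr_demo = 16000
t = np.arange(sr_demo) / sr_demo
amplitudes = [0.05, 0.1, 0.2, 0.4, 0.8]
y_demo = np.concatenate([a * np.sin(2 * np.pi * 220 * t) for a in amplitudes])

rms_db, peak_db = segment_levels_db(y_demo)
for i, (r, p) in enumerate(zip(rms_db, peak_db), start=1):
    print(f"Segment {i}: RMS {r:6.1f} dBFS, peak {p:6.1f} dBFS")
print(f"RMS spread across segments: {rms_db.max() - rms_db.min():.1f} dB")
```

Applied to the real signal, for example `segment_levels_db(y)`, a small RMS spread would be consistent with the auto-gain explanation, while a spread of many dB would suggest that the level differences were recorded but are hard to see at the plot scale.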
Question 2: Noise Reduction with Filtering
# ==========================================================
# ANSWER TO QUESTION 2 - Noise Reduction with Filtering
# ==========================================================
# ========================================
# 1. Load Audio
# ========================================
base_dir = Path.cwd()
audio_path = base_dir / "data" / "soal2.wav"
y, sr = librosa.load(audio_path, sr=None)
print("="*50)
print("AUDIO INFORMATION")
print("="*50)
print(f"File Path : {audio_path}")
print(f"Sample Rate : {sr} Hz")
print(f"Duration : {librosa.get_duration(y=y, sr=sr):.2f} s")
print(f"Samples : {len(y)}")
print(f"Channels : {'Mono' if len(y.shape) == 1 else 'Stereo'}")
print("="*50)
# ========================================
# 2. Original Audio Visualization
# ========================================
print("\nDisplaying waveform and spectrogram (original)...")
# Create a figure with 1 row and 2 columns
fig, axes = plt.subplots(1, 2, figsize=(14, 4))
# --- Waveform ---
librosa.display.waveshow(y, sr=sr, ax=axes[0])
axes[0].set_title("Original Audio Waveform")
axes[0].set_xlabel("Time (s)")
axes[0].set_ylabel("Amplitude")
# --- Spectrogram ---
S = np.abs(librosa.stft(y))
S_db = librosa.amplitude_to_db(S, ref=np.max)
img = librosa.display.specshow(S_db, sr=sr, x_axis='time', y_axis='log', cmap='magma', ax=axes[1])
axes[1].set_title("Original Audio Spectrogram")
fig.colorbar(img, ax=axes[1], format='%+2.0f dB')
plt.tight_layout()
plt.show()
print("\n" + "="*50)
print("ORIGINAL AUDIO PLAYBACK")
print("="*50)
display(ipd.Audio(y, rate=sr))
# ========================================
# 3. Filter Function
# ========================================
def butter_filter(y, sr, btype, cutoff):
    """General Butterworth filter for low, high, or band type"""
    nyq = 0.5 * sr
    if btype == 'band':
        low, high = np.array(cutoff) / nyq
        b, a = butter(5, [low, high], btype='band')
    else:
        normal_cutoff = cutoff / nyq
        b, a = butter(5, normal_cutoff, btype=btype)
    return filtfilt(b, a, y)
# ========================================
# 4. Apply Filters (Default Settings)
# ========================================
print("\n" + "="*50)
print("FILTER APPLICATION (Default Cutoffs)")
print("="*50)
low_pass = butter_filter(y, sr, 'low', 1000)
high_pass = butter_filter(y, sr, 'high', 1000)
band_pass = butter_filter(y, sr, 'band', (300, 3000))
# Play back the filtered results
print("\n▶ Low-Pass Filter (1000 Hz)")
display(ipd.Audio(low_pass, rate=sr))
print("\n▶ High-Pass Filter (1000 Hz)")
display(ipd.Audio(high_pass, rate=sr))
print("\n▶ Band-Pass Filter (300–3000 Hz)")
display(ipd.Audio(band_pass, rate=sr))
# ========================================
# 5. Filter Comparison Visualization
# ========================================
fig, axes = plt.subplots(3, 2, figsize=(16, 8))
filters = {
    "Low-Pass (1000 Hz)": low_pass,
    "High-Pass (1000 Hz)": high_pass,
    "Band-Pass (300–3000 Hz)": band_pass
}
for i, (title, data) in enumerate(filters.items()):
    # Waveform
    librosa.display.waveshow(data, sr=sr, ax=axes[i, 0])
    axes[i, 0].set_title(f"{title} - Waveform")
    axes[i, 0].set_xlabel("Time (s)")
    axes[i, 0].set_ylabel("Amplitude")
    # Spectrogram
    S = np.abs(librosa.stft(data))
    S_db = librosa.amplitude_to_db(S, ref=np.max)
    img = librosa.display.specshow(S_db, sr=sr, x_axis='time', y_axis='log', ax=axes[i, 1], cmap='magma')
    axes[i, 1].set_title(f"{title} - Spectrogram")
    fig.colorbar(img, ax=axes[i, 1], format='%+2.0f dB')
plt.tight_layout()
plt.show()
# ========================================
# 6. Cutoff Frequency Variation Experiment (Playback and Plots)
# ========================================
cutoff_values = [500, 1000, 2000]
for cutoff in cutoff_values:
    print(f"\nTesting cutoff frequency: {cutoff} Hz")
    low_f = butter_filter(y, sr, 'low', cutoff)
    high_f = butter_filter(y, sr, 'high', cutoff)
    # Band-pass centered on the cutoff: from 0.5x to 1.5x the cutoff frequency
    band_f = butter_filter(y, sr, 'band', (cutoff / 2, cutoff * 1.5))
    # Play back the filtered results
    print(f"\n▶ Low-Pass {cutoff} Hz")
    display(ipd.Audio(low_f, rate=sr))
    print(f"\n▶ High-Pass {cutoff} Hz")
    display(ipd.Audio(high_f, rate=sr))
    print(f"\n▶ Band-Pass {cutoff/2:.0f}-{cutoff*1.5:.0f} Hz")
    display(ipd.Audio(band_f, rate=sr))
    # --- Visualize all three filters in a single figure ---
    fig, axes = plt.subplots(3, 2, figsize=(14, 8))
    fig.suptitle(f"Filter Comparison at {cutoff} Hz", fontsize=14)
    filtered_audios = {
        f"Low-Pass ({cutoff} Hz)": low_f,
        f"High-Pass ({cutoff} Hz)": high_f,
        f"Band-Pass ({int(cutoff/2)}-{int(cutoff*1.5)} Hz)": band_f
    }
    for i, (title, data) in enumerate(filtered_audios.items()):
        # Waveform
        librosa.display.waveshow(data, sr=sr, ax=axes[i, 0], max_points=20000)
        axes[i, 0].set_title(f"{title} - Waveform")
        axes[i, 0].set_xlabel("Time (s)")
        axes[i, 0].set_ylabel("Amplitude")
        # Spectrogram
        S_db = librosa.amplitude_to_db(np.abs(librosa.stft(data)), ref=np.max)
        img = librosa.display.specshow(S_db, sr=sr, x_axis='time', y_axis='log',
                                       ax=axes[i, 1], cmap='magma')
        axes[i, 1].set_title(f"{title} - Spectrogram")
        fig.colorbar(img, ax=axes[i, 1], format='%+2.0f dB')
    plt.tight_layout(rect=[0, 0, 1, 0.97])
    plt.show()
    plt.close(fig)
print("\n" + "="*50)
print("FILTERING COMPLETE")
print("="*50)
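The `butter_filter` helper above designs the filter in transfer-function (`b, a`) form. For low cutoffs relative to the sample rate, such as the 250-750 Hz band-pass at 44100 Hz, the SciPy documentation recommends second-order sections for better numerical behavior. The sketch below is one possible variant, not a replacement the assignment requires. The names `butter_filter_sos` and `gain_db_at` and the two-tone demo signal are hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, sosfreqz


def butter_filter_sos(y, sr, btype, cutoff, order=5):
    """Zero-phase Butterworth filter built from second-order sections."""
    # fs=sr lets SciPy accept the cutoff directly in Hz (scalar or (low, high) pair)
    sos = butter(order, cutoff, btype=btype, fs=sr, output='sos')
    return sosfiltfilt(sos, y)


def gain_db_at(sr, btype, cutoff, freq_hz, order=5):
    """Single-pass gain of the filter in dB at freq_hz."""
    sos = butter(order, cutoff, btype=btype, fs=sr, output='sos')
    _, h = sosfreqz(sos, worN=[freq_hz], fs=sr)
    return 20 * np.log10(np.abs(h[0]))


# Hypothetical demo signal: a 200 Hz tone (kept) plus a 5 kHz tone (removed)
sr_demo = 44100
t = np.arange(sr_demo) / sr_demo
wanted = np.sin(2 * np.pi * 200 * t)
y_demo = wanted + np.sin(2 * np.pi * 5000 * t)

y_low = butter_filter_sos(y_demo, sr_demo, 'low', 1000)
residual_rms = np.sqrt(np.mean((y_low - wanted) ** 2))
cutoff_gain = gain_db_at(sr_demo, 'low', 1000, 1000)

print(f"Residual RMS after low-pass: {residual_rms:.4f}")
print(f"Single-pass gain at the cutoff: {cutoff_gain:.2f} dB")
```

Because `sosfiltfilt` (like `filtfilt`) runs the filter forward and backward, the attenuation in dB is doubled relative to the single-pass gain reported by `gain_db_at`, which is worth keeping in mind when comparing cutoff values by ear.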
==================================================
AUDIO INFORMATION
==================================================
File Path : c:\Users\muham\OneDrive\Desktop\sistem-teknologi-multimedia-122140193\worksheet3\data\soal2.wav
Sample Rate : 44100 Hz
Duration : 11.05 s
Samples : 487335
Channels : Mono
==================================================

Displaying waveform and spectrogram (original)...

==================================================
ORIGINAL AUDIO PLAYBACK
==================================================

==================================================
FILTER APPLICATION (Default Cutoffs)
==================================================

▶ Low-Pass Filter (1000 Hz)

▶ High-Pass Filter (1000 Hz)

▶ Band-Pass Filter (300–3000 Hz)

Testing cutoff frequency: 500 Hz

▶ Low-Pass 500 Hz

▶ High-Pass 500 Hz

▶ Band-Pass 250-750 Hz

Testing cutoff frequency: 1000 Hz

▶ Low-Pass 1000 Hz

▶ High-Pass 1000 Hz

▶ Band-Pass 500-1500 Hz

Testing cutoff frequency: 2000 Hz

▶ Low-Pass 2000 Hz